ABSTRACT

Support Vector Machines (SVMs) have become popular due to their accuracy in classifying sparse data
sets. Their computational time can be virtually independent of the size of the feature vector. SVMs have
been shown to outperform other learning machines on many data sets. In this paper, we use SVMs to
detect a car in a lane of traffic. Digital pictures of various driving situations are used. The results from the
SVM algorithm are compared to results from a standard neural network approach.

1.  INTRODUCTION

Support vector machines (SVMs) are wide-margin classifiers that solve a quadratic programming problem to
find the maximum separation between classes [1-4]. The algorithm is applied to the Tank Automotive
Research, Development, and Engineering Center (TARDEC) car/lane image set. The image set was
obtained using a digital camera mounted on a vehicle dash. The images show either a clear road ahead or a
car in front of the camera at various distances. A simulation experiment is performed to determine how
well SVMs can warn a driver when a vehicle is in front of the car. The SVM results are compared with
results from a standard neural network approach.

Section  2  describes  the  different  methods  used  to  process  the  images  to  find  a  good  feature  vector.  Section 
3  gives  the  results  of  the  study.  Section  4  describes  other  methods  that  might  warrant  further  investigation 
into  solving  this  problem.  For  detailed  information  on  SVMs  or  neural  networks,  the  reader  is  advised  to 
consult  the  references. 


2.  IMAGE  PRE-PROCESSING  METHODS 


Various  techniques  were  investigated  to  find  a  feature  vector  that  described  the  data  set.  Methods  such  as 
wavelets,  masks,  and  histograms  were  explored  with  some  having  success  and  others  not.  This  section 
describes  the  thoughts  behind  the  investigation  and  the  results  that  the  methods  gave. 

Data was collected using several different digital cameras mounted on the dash of various cars. The images
were in color, with sizes ranging from 640x512 to 1280x1024. Pictures were taken of common road surfaces
(dirt, highway, freeway, etc.) with either cars at different distances or no cars. Before processing, each
image was converted to grayscale and resized to a standard size using the nearest-neighbor algorithm [5].
The investigation into creating a good feature vector centered on finding the edges of the cars.
Unfortunately, isolating the edges around the car turned out to be difficult because other objects in the
pictures had more prominent features.
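
The grayscale conversion and nearest-neighbor resizing step can be sketched as follows. This is a minimal pure-Python illustration, not the code used in the study: the function names, the luminance weights (the common ITU-R BT.601 values), and the floor-based pixel mapping are assumptions, and the target size is left as a parameter because the text does not state it.

```python
def to_grayscale(rgb):
    """Convert rows of (r, g, b) tuples to rows of gray levels.

    The weights are the common ITU-R BT.601 luminance values (an assumption;
    the paper does not say which conversion was used).
    """
    return [[0.299 * r + 0.587 * g + 0.114 * b for (r, g, b) in row]
            for row in rgb]


def resize_nearest(img, out_h, out_w):
    """Resize a 2-D list-of-lists image by nearest-neighbor sampling.

    Each output pixel copies the source pixel at floor(index * scale).
    """
    in_h, in_w = len(img), len(img[0])
    out = []
    for y in range(out_h):
        src_y = min(in_h - 1, int(y * in_h / out_h))
        out.append([img[src_y][min(in_w - 1, int(x * in_w / out_w))]
                    for x in range(out_w)])
    return out
```

Nearest-neighbor sampling copies existing pixel values rather than averaging them, so it is cheap but can alias thin edges when an image is shrunk by a large factor.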

Figure 1a - Image
Figure 1b - Sobel mask with threshold = 0.2
Figure 1c - Sobel mask with threshold = 0.4
Figure 1d - Sobel mask with threshold = 0.6

Figure 1 - Original image grayed and resized (a). Sobel mask applied to the image and thresholded (b, c, d).


Figure 1a is an example of a typical grayscale image. A Sobel mask [5-7] is applied and the results are
thresholded to find the edges (Figures 1b-1d). Figure 1 shows that raising the threshold removes more of
the car's edges. To reduce noise in the image without losing too much of the car outline, another
algorithm must be used.
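
A minimal sketch of Sobel edge detection followed by thresholding is given here for illustration. It assumes the gradient magnitude is scaled by the largest magnitude in the image so that thresholds such as 0.2, 0.4, and 0.6 fall in [0, 1]; the text does not state how the threshold was normalized, and the function name is hypothetical.

```python
import math

SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edges(gray, threshold):
    """Return a binary edge map (1 = edge) for a 2-D list of gray levels.

    Magnitudes are divided by the image's peak magnitude before comparison
    with `threshold` (an assumed normalization). Border pixels stay 0.
    """
    h, w = len(gray), len(gray[0])
    mag = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    p = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * p
                    gy += SOBEL_Y[dy][dx] * p
            mag[y][x] = math.hypot(gx, gy)
    peak = max(max(row) for row in mag) or 1.0  # flat image: avoid 0 / 0
    return [[1 if m / peak > threshold else 0 for m in row] for row in mag]
```

Because the normalized magnitudes are fixed for a given image, raising the threshold can only remove edge pixels, never add them, which matches the behavior seen in Figure 1.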

Applying the wavelet transform [8-10] to the image (or to the edges of the image) works well if the intent is
to use the approximation from the wavelet transform to resize the image. However, local statistics from
the wavelet transform (such as mean and energy) do not increase the classification rate. Instead, a look at
the data shows that, in pictures that contain a car, the car lies in the center region. We can therefore
reduce the image by cutting out the middle 128x128 section. Doing so reduces the unwanted edges caused by
background objects. However, this method often cuts off edges of vehicles that are too close or off-center.
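
Cutting out the middle section amounts to a centered crop. The sketch below is illustrative only; the function name and the choice to raise an error for undersized images are assumptions.

```python
def center_crop(img, size=128):
    """Return the centered size x size window of a 2-D list-of-lists image."""
    h, w = len(img), len(img[0])
    if h < size or w < size:
        raise ValueError("image is smaller than the requested crop")
    top, left = (h - size) // 2, (w - size) // 2
    return [row[left:left + size] for row in img[top:top + size]]
```

A fixed window like this is what causes the drawback noted above: any part of a vehicle that falls outside the central window is discarded.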

Another processing technique that was investigated, called the box-in-box method [11], uses boxes of
different sizes around the center. Box sizes were 16x16, 32x32, 64x64, 128x128, and 256x256. These boxes
contained the Sobel edges of the image in that region. The goal was to encompass the full car outline
while reducing the added noise of the edges due to the background. Each box represented a distance
between the car and the camera (for images with cars). A number of boxes were trained (on both car and
no-car images) and tested. If one of the boxes showed that a car outline was present, then the classifier
classified the image as having a car in front. Unfortunately, this method did not classify well due to
different car styles, patches or glare on the roads, off-center cars, bridges, etc.
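
The box-in-box decision rule (flag a car if any box reports an outline) might look like the sketch below. The per-box classifiers are passed in as hypothetical callables because the text does not describe how each box was trained; the function names are assumptions as well.

```python
BOX_SIZES = (16, 32, 64, 128, 256)  # box sizes listed in the text


def centered_boxes(edge_map, sizes=BOX_SIZES):
    """Map each box size to the centered sub-image of that size.

    Sizes larger than the image are skipped.
    """
    h, w = len(edge_map), len(edge_map[0])
    boxes = {}
    for s in sizes:
        if s > h or s > w:
            continue
        top, left = (h - s) // 2, (w - s) // 2
        boxes[s] = [row[left:left + s] for row in edge_map[top:top + s]]
    return boxes


def car_in_front(edge_map, box_classifiers):
    """Return True if any per-box classifier fires on its box.

    `box_classifiers` maps a box size to a callable returning True or False.
    These callables stand in for the trained classifiers (hypothetical).
    """
    boxes = centered_boxes(edge_map, sizes=tuple(box_classifiers))
    return any(box_classifiers[s](box) for s, box in boxes.items())
```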

The final feature vector was developed using a few of the techniques described above, along with an
algorithm to find horizontal lines [12]. Since the edges of each car are not consistent with one another, a
new way of viewing the images was needed. The car edges contain a number of horizontal lines coming
from the bumper, rear window, top, bottom, etc. In no-car images, the horizon and tree lines also
produced horizontal lines, so they had to be sectioned off.

A Sobel mask was applied to find the edges in the center 128x128 area of the images. After this, an
algorithm was applied to find runs of consecutive horizontal edge pixels at least 6 pixels in length.
Examples of the processing are shown in Figure 2 (for a car) and Figure 3 (for no car).
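
One way to find horizontal runs of edge pixels is a single scan along each row of the binary edge map, as sketched below. This is an assumed implementation for illustration; the cited line-finding algorithm may differ.

```python
def horizontal_runs(edge_map, min_len=6):
    """Return the lengths of all horizontal runs of edge pixels >= min_len.

    `edge_map` is a 2-D list of 0/1 values (1 = edge pixel).
    """
    lengths = []
    for row in edge_map:
        run = 0
        for pixel in list(row) + [0]:  # trailing 0 closes a run at the row end
            if pixel:
                run += 1
            else:
                if run >= min_len:
                    lengths.append(run)
                run = 0
    return lengths
```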


Figure 2a - Image of a car
Figure 2b - Center region's horizontal lines from edges

Figure 2 - Image of a car (a) and the center horizontal lines (b)


Figure 3a - Image of no car

Figure 3 - Image of no car (a) and the center region's horizontal lines (b)


From this, the line lengths were grouped (based on the number of consecutive pixels) and a
histogram was formed [12]. The line groups were 6-8, 9-11, 12-14, 15-17, 18-20, and 21+ pixels. These six
counts, along with the total number of horizontal lines (at least 6 pixels long), were used as the final
feature vector.
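
The seven-element feature vector can be built from a list of line lengths as in the sketch below. The function name and the ordering of the elements (six bin counts followed by the total) are assumptions; the text gives the bins but not the layout.

```python
BINS = ((6, 8), (9, 11), (12, 14), (15, 17), (18, 20))  # final bin is 21+


def line_length_features(lengths):
    """Build the 7-element feature vector from horizontal line lengths.

    Elements 0-5 count lines in the bins 6-8, 9-11, 12-14, 15-17, 18-20, 21+.
    Element 6 is the total number of lines at least 6 pixels long.
    """
    counts = [0] * 6
    for n in lengths:
        if n < 6:
            continue  # shorter runs are not counted as lines
        for i, (lo, hi) in enumerate(BINS):
            if lo <= n <= hi:
                counts[i] += 1
                break
        else:
            counts[5] += 1  # 21 pixels or longer
    return counts + [sum(counts)]
```

For example, line lengths of 6, 8, 9, 14, 20, 21, and 40 pixels give the vector [2, 1, 1, 0, 1, 2, 7].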


3.  RESULTS 


There were a total of 218 images: 89 car images and 129 no-car images. Each image was represented
by the seven-element feature vector described in Section 2. The feature data was split into training and
test vectors and put into a support vector machine (SVM) [3] and a standard neural network (NN)1'. The
SVM used a quadratic polynomial kernel. The NN had two layers, with the hidden layer having either three
or five neurons and the output layer having one neuron. The activation function for all neurons was a
unipolar sigmoid function. The results are shown in Tables 1 and 2 below:
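
For readers unfamiliar with the two classifiers, the sketch below shows the decision computations only: a quadratic polynomial kernel with the usual SVM decision function, and the forward pass of a two-layer network with unipolar sigmoid activations. Training is omitted. The kernel offset `c`, all function names, and the weight layout are assumptions; the text does not give these details.

```python
import math


def poly2_kernel(x, z, c=1.0):
    """Quadratic polynomial kernel (x . z + c)^2; the offset c is assumed."""
    return (sum(a * b for a, b in zip(x, z)) + c) ** 2


def svm_decide(x, support_vectors, coeffs, bias):
    """Return +1 or -1 from sum_i coeff_i * K(sv_i, x) + bias.

    Each coeff_i is the product alpha_i * y_i found by training.
    """
    s = sum(a * poly2_kernel(sv, x) for sv, a in zip(support_vectors, coeffs))
    return 1 if s + bias >= 0 else -1


def sigmoid(v):
    """Unipolar sigmoid with outputs in (0, 1), written to avoid overflow."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def nn_forward(x, hidden_w, hidden_b, out_w, out_b):
    """Forward pass: inputs -> hidden layer (3 or 5 neurons) -> 1 output."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(ws, x)) + b)
              for ws, b in zip(hidden_w, hidden_b)]
    return sigmoid(sum(w * h for w, h in zip(out_w, hidden)) + out_b)
```

With a unipolar sigmoid the network output lies between 0 and 1, so a cutoff such as 0.5 (an assumption) would be needed to turn it into a car or no-car decision.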


Trained 50 car and 50 no car:

SVM - poly 2 kernel
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              4                     75
Table 1a

NN - 3 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              6                     73
Table 1b

NN - 5 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                38                      1
Actual No Car              6                     73
Table 1c

Table 1 - SVM and NN classification matrix for 100 training samples


Calculating the classification rate for the SVM of Table 1a, we see that it chooses the correct class 95.8%
of the time. Both NNs have the same classification rate of 94.9% (see Tables 1b and 1c). For Table
2, the number of training vectors increased from 50 per class to 64 per class. The results show that the
SVM classification rate increased to 96.7%. The NN classification rates were 93.3% and 94.4% for the
three and five neuron networks, respectively.
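
The classification rate is the number of correct decisions divided by the number of test images. The sketch below applies that formula to the two SVM matrices; the function name and the matrix layout (rows are actual classes, columns are assigned classes) are choices made for this illustration.

```python
def classification_rate(matrix):
    """Fraction of correct decisions in a 2x2 classification matrix.

    `matrix` is [[car as car, car as no car], [no car as car, no car as no car]],
    matching the row and column layout of the tables.
    """
    correct = matrix[0][0] + matrix[1][1]
    total = sum(sum(row) for row in matrix)
    return correct / total


svm_100 = [[38, 1], [4, 75]]  # SVM trained on 50 car and 50 no-car images
svm_128 = [[23, 2], [1, 64]]  # SVM trained on 64 car and 64 no-car images
print(f"{classification_rate(svm_100):.1%}")  # 95.8%
print(f"{classification_rate(svm_128):.1%}")  # 96.7%
```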


Trained 64 car and 64 no car:

SVM - poly 2 kernel
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              1                     64
Table 2a

NN - 3 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              3                     62
Table 2b

NN - 5 hidden layer neurons
                  Classified as a car    Classified as no car
Actual Car                23                      2
Actual No Car              4                     61
Table 2c

Table 2 - SVM and NN classification matrix for 128 training samples


The results from this study show an example of SVMs outperforming NNs on a small data set. The
SVM classification rates were slightly higher than the NN rates for 100 training samples. The SVM's
classification rate increased as the number of training samples increased. However, the NNs'
classification rates stayed the same or decreased as the training samples increased.


4.  FURTHER  STUDY 


Using histograms of groups of horizontal lines seems to tell us that something is in the path of the vehicle.
Problems will arise if the vehicle's roll position is different from that of the vehicle that took the
pictures; horizontal lines will no longer be horizontal. A method to find parallel lines rather than
horizontal lines should be employed. Using templates is another idea that could prove fruitful if it were
used with the box-in-box method. The templates would be used for training (at different distances) and the
images would be used for testing. Other paths of investigation should include better algorithms for
denoising the data. The goal would be to remove all edges but those of the car. Unfortunately, the car is
not always the most prominent feature in the image.